83 research outputs found
Estimating Mixture Entropy with Pairwise Distances
Mixture distributions arise in many parametric and non-parametric settings --
for example, in Gaussian mixture models and in non-parametric estimation. It is
often necessary to compute the entropy of a mixture, but, in most cases, this
quantity has no closed-form expression, making some form of approximation
necessary. We propose a family of estimators based on a pairwise distance
function between mixture components, and show that this estimator class has
many attractive properties. For many distributions of interest, the proposed
estimators are efficient to compute, differentiable in the mixture parameters,
and become exact when the mixture components are clustered. We prove this
family includes lower and upper bounds on the mixture entropy. The Chernoff
-divergence gives a lower bound when chosen as the distance function,
with the Bhattacharyya distance providing the tightest lower bound for
components that are symmetric and members of a location family. The
Kullback-Leibler divergence gives an upper bound when used as the distance
function. We provide closed-form expressions of these bounds for mixtures of
Gaussians, and discuss their applications to the estimation of mutual
information. We then demonstrate that our bounds are significantly tighter than
well-known existing bounds using numeric simulations. This estimator class is
very useful in optimization problems involving maximization/minimization of
entropy and mutual information, such as MaxEnt and rate distortion problems.Comment: Corrects several errata in published version, in particular in
Section V (bounds on mutual information
Nonlinear Information Bottleneck
Information bottleneck (IB) is a technique for extracting information in one
random variable that is relevant for predicting another random variable
. IB works by encoding in a compressed "bottleneck" random variable
from which can be accurately decoded. However, finding the optimal
bottleneck variable involves a difficult optimization problem, which until
recently has been considered for only two limited cases: discrete and
with small state spaces, and continuous and with a Gaussian joint
distribution (in which case optimal encoding and decoding maps are linear). We
propose a method for performing IB on arbitrarily-distributed discrete and/or
continuous and , while allowing for nonlinear encoding and decoding
maps. Our approach relies on a novel non-parametric upper bound for mutual
information. We describe how to implement our method using neural networks. We
then show that it achieves better performance than the recently-proposed
"variational IB" method on several real-world datasets
Caveats for information bottleneck in deterministic scenarios
Information bottleneck (IB) is a method for extracting information from one
random variable that is relevant for predicting another random variable
. To do so, IB identifies an intermediate "bottleneck" variable that has
low mutual information and high mutual information . The "IB
curve" characterizes the set of bottleneck variables that achieve maximal
for a given , and is typically explored by maximizing the "IB
Lagrangian", . In some cases, is a deterministic
function of , including many classification problems in supervised learning
where the output class is a deterministic function of the input . We
demonstrate three caveats when using IB in any situation where is a
deterministic function of : (1) the IB curve cannot be recovered by
maximizing the IB Lagrangian for different values of ; (2) there are
"uninteresting" trivial solutions at all points of the IB curve; and (3) for
multi-layer classifiers that achieve low prediction error, different layers
cannot exhibit a strict trade-off between compression and prediction, contrary
to a recent proposal. We also show that when is a small perturbation away
from being a deterministic function of , these three caveats arise in an
approximate way. To address problem (1), we propose a functional that, unlike
the IB Lagrangian, can recover the IB curve in all cases. We demonstrate the
three caveats on the MNIST dataset
Survival outcome according to KRAS mutation status in newly diagnosed patients with stage IV non-small cell lung cancer treated with platinum doublet chemotherapy
INTRODUCTION: Mutations (MT) of the KRAS gene are the most common mutation in non-small cell lung cancer (NSCLC), seen in about 20-25% of all adenocarcinomas. Effect of KRAS MT on response to cytotoxic chemotherapy is unclear.
METHODS: We undertook a single-institution retrospective analysis of 93 consecutive patients with stage IV NSCLC adenocarcinoma with known KRAS and EGFR MT status to determine the association of KRAS MT with survival. All patients were treated between January 1, 2008 and December 31, 2011 with standard platinum based chemotherapy at the University of Pennsylvania. Overall and progression free survival were analyzed using Kaplan-Meier and Cox proportional hazard methods.
RESULTS: All patients in this series received platinum doublet chemotherapy, and 42 (45%) received bevacizumab. Overall survival and progression free survival for patients with KRAS MT was no worse than for patients with wild type KRAS. Median overall survival for patients with KRAS MT was 19 months (mo) vs. 15.6 mo for KRAS WT, p = 0.34, and progression-free survival was 6.2 mo in patients with KRAS MT vs. 7mo in patients with KRAS WT, p = 0.51. In multivariable analysis including age, race, gender, and ECOG PS, KRAS MT was not associated with overall survival (HR 1.12, 95% CI 0.58-2.16, p = 0.74) or progression free survival (HR 0.80, 95% CI 0.48-1.34, p = 41). Of note, receipt of bevacizumab was associated with improved overall survival only in KRAS WT patients (HR 0.34, p = 0.01).
CONCLUSIONS: KRAS MT are not associated with inferior progression-free and overall survival in advanced NSCLC patients treated with standard first-line platinum-based chemotherapy
Towards practical reinforcement learning for tokamak magnetic control
Reinforcement learning (RL) has shown promising results for real-time control
systems, including the domain of plasma magnetic control. However, there are
still significant drawbacks compared to traditional feedback control approaches
for magnetic confinement. In this work, we address key drawbacks of the RL
method; achieving higher control accuracy for desired plasma properties,
reducing the steady-state error, and decreasing the required time to learn new
tasks. We build on top of \cite{degrave2022magnetic}, and present algorithmic
improvements to the agent architecture and training procedure. We present
simulation results that show up to 65\% improvement in shape accuracy, achieve
substantial reduction in the long-term bias of the plasma current, and
additionally reduce the training time required to learn new tasks by a factor
of 3 or more. We present new experiments using the upgraded RL-based
controllers on the TCV tokamak, which validate the simulation results achieved,
and point the way towards routinely achieving accurate discharges using the RL
approach
Recommended from our members
Dynamics of beneficial epidemics
Pathogens can spread epidemically through populations. Beneficial contagions, such as viruses that enhance host survival or technological innovations that improve quality of life, also have the potential to spread epidemically. How do the dynamics of beneficial biological and social epidemics differ from those of detrimental epidemics? We investigate this question using a breadth-first modeling approach involving three distinct theoretical models. First, in the context of population genetics, we show that a horizontally-transmissible element that increases fitness, such as viral DNA, spreads superexponentially through a population, more quickly than a beneficial mutation. Second, in the context of behavioral epidemiology, we show that infections that cause increased connectivity lead to superexponential fixation in the population. Third, in the context of dynamic social networks, we find that preferences for increased global infection accelerate spread and produce superexponential fixation, but preferences for local assortativity halt epidemics by disconnecting the infected from the susceptible. We conclude that the dynamics of beneficial biological and social epidemics are characterized by the rapid spread of beneficial elements, which is facilitated in biological systems by horizontal transmission and in social systems by active spreading behavior of infected individuals
Age at first birth in women is genetically associated with increased risk of schizophrenia
Prof. Paunio on PGC:n jäsenPrevious studies have shown an increased risk for mental health problems in children born to both younger and older parents compared to children of average-aged parents. We previously used a novel design to reveal a latent mechanism of genetic association between schizophrenia and age at first birth in women (AFB). Here, we use independent data from the UK Biobank (N = 38,892) to replicate the finding of an association between predicted genetic risk of schizophrenia and AFB in women, and to estimate the genetic correlation between schizophrenia and AFB in women stratified into younger and older groups. We find evidence for an association between predicted genetic risk of schizophrenia and AFB in women (P-value = 1.12E-05), and we show genetic heterogeneity between younger and older AFB groups (P-value = 3.45E-03). The genetic correlation between schizophrenia and AFB in the younger AFB group is -0.16 (SE = 0.04) while that between schizophrenia and AFB in the older AFB group is 0.14 (SE = 0.08). Our results suggest that early, and perhaps also late, age at first birth in women is associated with increased genetic risk for schizophrenia in the UK Biobank sample. These findings contribute new insights into factors contributing to the complex bio-social risk architecture underpinning the association between parental age and offspring mental health.Peer reviewe
- …